Informed algorithms for sound source separation in enclosed reverberant environments
Abstract
While humans can pick out a sound of interest amid a cacophony of competing sounds in an echoic environment, machine-based methods lag behind on this task. This thesis therefore aims to improve the performance of audio separation algorithms when they are “informed”, i.e. have access to source location information. These locations are assumed to be known a priori in this work, for example through video processing. Initially, a method based on a multi-microphone array combined with binary time-frequency masking is proposed. A robust least squares frequency invariant data independent beamformer, designed with the location information, is used to estimate the sources. To further enhance the estimated sources, binary time-frequency masking is applied as post-processing, although cepstral domain smoothing is required to mitigate musical noise. To tackle the under-determined case and further improve separation performance at higher reverberation times, a two-microphone method is described that is inspired by human auditory processing and generates soft time-frequency masks. In this approach, the interaural level difference, interaural phase difference and mixing vectors are modeled probabilistically in the time-frequency domain, and the model parameters are learned with the expectation-maximization (EM) algorithm. A direction vector is estimated for each source from the location information and used as the mean parameter of the mixing vector model. Soft time-frequency masks are used to reconstruct the sources. A spatial covariance model is then in-
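As a rough illustration of the cues and masks named in the abstract, the sketch below computes the interaural level difference and interaural phase difference for a single time-frequency bin, then turns per-source likelihoods into either a soft or a binary mask. It is not the thesis's model: the function names, the `eps` guard, and the likelihood values are hypothetical, and the EM-learned probabilistic model is not reproduced here.

```python
import cmath
import math


def interaural_cues(left, right, eps=1e-12):
    """Return (ILD in dB, IPD in radians) for one time-frequency bin.

    `left` and `right` are complex STFT coefficients of the two channels.
    `eps` is an assumed small constant that avoids log(0) on silent bins.
    """
    ild = 20.0 * math.log10((abs(left) + eps) / (abs(right) + eps))
    ipd = cmath.phase(left * right.conjugate())
    return ild, ipd


def soft_masks(likelihoods):
    """Normalize per-source likelihoods for one bin so the masks sum to 1."""
    total = sum(likelihoods)
    if total == 0.0:
        # No evidence for any source: share the bin equally.
        return [1.0 / len(likelihoods)] * len(likelihoods)
    return [p / total for p in likelihoods]


def binary_masks(likelihoods):
    """Assign the whole bin to the most likely source (winner takes all)."""
    winner = max(range(len(likelihoods)), key=lambda i: likelihoods[i])
    return [1.0 if i == winner else 0.0 for i in range(len(likelihoods))]


if __name__ == "__main__":
    # Hypothetical bin: left channel is louder and lags the right by 90 degrees.
    ild, ipd = interaural_cues(2 + 0j, 0 + 1j)
    print(f"ILD = {ild:.2f} dB, IPD = {ipd:.3f} rad")
    # Hypothetical likelihoods of two sources for this bin.
    print(soft_masks([0.3, 0.1]), binary_masks([0.3, 0.1]))
```

A soft mask keeps a fraction of each bin for every source, whereas the binary mask discards everything but the dominant source. That hard on/off switching is what tends to produce the musical noise mentioned in the abstract.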
Similar resources
Exploiting the Self-Steering Capability of Blind Source Separation to Localize Two or More Sound Sources in Adverse Environments
Blind Source Separation (BSS) algorithms have often been interpreted as a set of blind adaptive beamformers. Although this interpretation does not entirely hold under realistic conditions, it gives some useful insights on the self-steering capacity of BSS techniques. Actually, while accurate source location information is usually necessary to steer a beamformer, BSS offers the possibility to re...
Binaural Source Separation in Non-ideal Reverberant Environments
This paper proposes a framework for separating several speech sources in non-ideal, reverberant environments. A movable human dummy head residing in a normal office room is used to model the conditions humans experience when listening to complex auditory scenes. Before the source separation takes place the human dummy head explores the auditory scene and extracts characteristics the same way as...
Estimation of fundamental frequency of reverberant speech by utilizing complex cepstrum analysis
This paper reports comparative evaluations of twelve typical methods of estimating fundamental frequency (F0) over huge speech-sound datasets in artificial reverberant environments. They involve several classic algorithms such as Cepstrum, AMDF, LPC, and modified autocorrelation algorithms. Other methods involve a few modern instantaneous amplitude- and/or frequency-based algorithms, such as STRA...
Integrating Monaural and Binaural Cues for Sound Localization and Segregation in Reverberant Environments
The problem of segregating a sound source of interest from an acoustic background has been extensively studied due to applications in hearing prostheses, robust speech/speaker recognition and audio information retrieval. Computational auditory scene analysis (CASA) approaches the segregation problem by utilizing grouping cues involved in the perceptual organization of sound by human listeners. ...
A real-time blind source separation scheme and its application to reverberant and noisy acoustic environments
In this paper, we present an efficient real-time implementation of a broadband algorithm for blind source separation (BSS) of convolutive mixtures. A recently introduced generic BSS framework based on a matrix formulation allows simultaneous exploitation of nonwhiteness and nonstationarity of the source signals using second-order statistics. We demonstrate here that this general scheme leads to...
Publication date: 2016